Inducing Probabilistic Syllable Classes Using Multivariate Clustering
Authors
Abstract
An approach to automatic detection of syllable structure is presented. We demonstrate a novel application of EM-based clustering to multivariate data, exemplified by the induction of 3- and 5-dimensional probabilistic syllable classes. The 3-dimensional models were subjected to a pseudo-disambiguation task, the result of which shows that the onset is the most variable, or least predictable, part of the syllable. An extensive qualitative evaluation shows that the method yields phonologically meaningful syllable classes. We then propose a novel approach to grapheme-to-phoneme conversion and show that syllable structure represents valuable information for pronunciation systems.
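As an illustration only, and not the authors' implementation, EM-based clustering of 3-dimensional syllable tuples can be sketched as follows. The sketch assumes a latent-class model in which the dimensions (here onset, nucleus, coda) are conditionally independent given the class, p(y) = Σ_c p(c) Π_i p(y_i | c). The function name `em_cluster`, the toy data, and the small smoothing constant `eps` are hypothetical choices made for this example.

```python
import random
from collections import defaultdict


def em_cluster(data, n_classes, n_iter=50, seed=0, eps=1e-9):
    """EM for a latent-class model p(y) = sum_c p(c) * prod_i p(y_i | c).

    `data` is a list of equal-length tuples of symbols. `eps` is a tiny
    additive smoothing term (an assumption of this sketch) that keeps a
    class from collapsing to exactly zero mass.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    vocab = [sorted({y[i] for y in data}) for i in range(dims)]

    # Uniform class prior; random, normalised conditional distributions.
    prior = [1.0 / n_classes] * n_classes
    cond = []
    for _ in range(n_classes):
        row = []
        for i in range(dims):
            weights = {v: rng.random() + 0.1 for v in vocab[i]}
            z = sum(weights.values())
            row.append({v: w / z for v, w in weights.items()})
        cond.append(row)

    for _ in range(n_iter):
        # E-step: accumulate expected counts weighted by p(c | y).
        class_mass = [0.0] * n_classes
        counts = [[defaultdict(float) for _ in range(dims)]
                  for _ in range(n_classes)]
        for y in data:
            joint = []
            for c in range(n_classes):
                p = prior[c]
                for i in range(dims):
                    p *= cond[c][i][y[i]]
                joint.append(p)
            total = sum(joint)
            for c in range(n_classes):
                post = joint[c] / total
                class_mass[c] += post
                for i in range(dims):
                    counts[c][i][y[i]] += post

        # M-step: re-estimate parameters from the expected counts.
        prior = [(m + eps) / (len(data) + eps * n_classes)
                 for m in class_mass]
        cond = [[{v: (counts[c][i][v] + eps)
                     / (class_mass[c] + eps * len(vocab[i]))
                  for v in vocab[i]}
                 for i in range(dims)]
                for c in range(n_classes)]
    return prior, cond


# Toy (onset, nucleus, coda) tuples, invented for illustration.
toy = [("st", "a", "t"), ("st", "i", "n"), ("b", "a", "t"),
       ("b", "i", "n"), ("", "a", "t"), ("", "o", "n")] * 5
prior, cond = em_cluster(toy, n_classes=2)
```

Here `prior[c]` estimates p(c) and `cond[c][i]` estimates the distribution over symbols in dimension `i` for class `c`; a 5-dimensional model would only change the tuple length.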
Similar resources
Inducing Probabilistic Syllable Classes Using Multivariate Clustering
An approach to automatic detection of syllable structure is presented. We demonstrate a novel application of EM-based clustering to multivariate data, exemplified by the induction of 3- and 5-dimensional probabilistic syllable classes. The qualitative evaluation shows that the method yields phonologically meaningful syllable classes. We then propose a novel approach to grapheme-to-phoneme conversi...
A Step-wise Usage-based Method for Inducing Polysemy-aware Verb Classes
We present an unsupervised method for inducing verb classes from verb uses in gigaword corpora. Our method consists of two clustering steps: verb-specific semantic frames are first induced by clustering verb uses in a corpus and then verb classes are induced by clustering these frames. By taking this step-wise approach, we can not only generate verb classes based on a massive amount of verb use...
Probabilistic Landslide Risk Analysis and Mapping (Case Study: Chehel-Chai Watershed, Golestan Province, Iran)
The efficiency of three statistical models, AHP surface-weighted density bivariate (semi-quantitative models), stepwise multivariate regression and logistic multivariate regression, was compared in the Chehel-Chai watershed in Golestan province, Iran. In the current study the hazard map was prepared according to the top model of landslide hazard map. Chehel-Chai watershed is located as one of Go...
Spectral Clustering for German Verbs
We describe and evaluate the application of a spectral clustering technique (Ng et al., 2002) to the unsupervised clustering of German verbs. Our previous work has shown that standard clustering techniques succeed in inducing Levin-style semantic classes from verb subcategorisation information. But clustering in the very high dimensional spaces that we use is fraught with technical and conceptua...
Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information
The paper describes the application of kMeans, a standard clustering technique, to the task of inducing semantic classes for German verbs. Using probability distributions over verb subcategorisation frames, we obtained an intuitively plausible clustering of 57 verbs into 14 classes. The automatic clustering was evaluated against independently motivated, hand-constructed semantic verb classes. A ...
Publication date: 2000